-
Notifications
You must be signed in to change notification settings - Fork 170
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Templated COSA command execution #1739
Templated COSA command execution #1739
Conversation
Throwing this up as an early WIP to show the idea and see if there any interest in pursuing this. This would give us a way to translate the RHCOS Pipeline commands into GoLang templates for command execution and allow for easy migration. The RHCOS job spec isn't particularly loved per se, but its in production use for the ART and Development pipelines. The other thought I had was replacing |
The reason I went with GoLang over Python for this piece of code was because GoLang is way nicer about cleaning up processes and allowing concurrency for parallel tasks is trivial. I created it as it own thing instead of putting into Mantle directly since it doesn't fit neatly there. |
Would it make sense to just make the steps file a shell script instead? So essentially, |
I thought about using
Also, since this doesn't have terminal emulation, its not suitable for interactive developer hack-benches. To be sure, I can see developers entering cosa and running |
OK a whole lot going on here.
There are a lot of "execute these things in parallel" languages, among them the venerable |
Partly. The base idea came before #1688; the code idea never saw the light of day, though. I went ahead and dusted this off because I am angling to have COSA run as a build containers via OCP 👍 on having
I haven't really looked at Ninja, tbh. I can look. But I was avoiding |
Okay, after looking at #1739 (comment), I think that this approach is still valid. Using existing tools is fine, but The genesis to this PR comes from the realization that the real difference between the RHCOS and FCOS pipelines is the business logic. And when I looked deeply at the domain, it becomes clear that the bulk of the Jenkins logic amounts to rendering configuration into CLI arguments and setting up the environment. My position is that giving COSA a file interface will simplify the Jenkins pipelines significantly.
I agree with understanding dependencies, and I've done some work in that regard. But, I think dependency tracking is orthogonal to this idea; a dependency tree may negate the need to run "stages" but that doesn't negate the sugar of having templating. Integration with another tool would be a big undertaking, especially if we want a pretty CLI experience. However, integration for the Developer use-case might look different than integration for production use-cases (e.g. Jenkins, FedMsg, OpenShift pods). |
Hmm, I'm still not sure I follow why we need to fold in
Can you sketch out a bit more what the Jenkins pipeline would look like? It's worth noting that this is introducing two file interfaces: the spec file and the template file. I agree that a spec file would be useful (WDYT about calling this e.g. a property file instead so we don't confuse it with RPM spec files?), and agree that the differences between the pipelines mostly lie in the business logic, but I'm having trouble reconciling how templates fit into the current flow. IIUC, doesn't it imply that whenever we want to do something at the Jenkins layer (e.g. setting the job description, marking a new stage, spawning k8s worker pods for parallel work, starting other jobs, etc...) we would have to keep going in and out of template scripts? Or would all that stuff be obsolete in this new world? |
I'm backing away from the idea of replacing
The world that I envision is COSA-as-a-Service where COSA is an OpenShift BuildConfig. Rather than relying on Jenkins to setup the pod, Jenkins would execute Why? Right now the RHCOS pipelines are greedy. The entire pipeline consumes a lot of RAM and a lot of disk space for a build and is scheduled on the same pod. In the RHCOS pipeline, we require privileges, which requires to play funny games to get Now, why templating when OpenShift has templates? One of the lessons learned in the RHCOS pipelines was that we ended up polluting the namespaces with essentially the same thing with minor variations. The idea of using templates is to re-use the same buildconfigs based on versions (ala The reason that envVar's don't work well here is Jenkins right now set the envVars. And don't get me started on truthiness in envVar strings. The RHCOS pipeline's complicated nature comes from the processing of the envVar's. OpenShift templates are a real PITA to update when new options are needed. I originally came up with the idea of having COSA support envVar's for CLI commands; that idea excited people about as much as root canal. When when I looked at the BuildConfig spec, you don't have the The other big reason I wanted to move away from Jenkins as the pod scheduler is because of multi-arch. Right now we have several down-stream teams that take the RHCOS pipeline and then standup Jenkins and run parallel pipelines. The cost of standing up builders is expensive. By having the interfaces this provides, we can run COSA anywhere we can run a container (e.g.
I would not reconcile it, tbh. This is future looking. Templates give us nothing today. Consider a distributed build system on aarch64, x86_64, and ppc64el. To start a build in a remote pod via a
And for testing:
(A post-commit hook would fetch the artifacts out) Follow? Now, to @cgwalters's point about build dependencies (i.e why not have COSA understand the dependencies?) is that build dependencies assume that dependencies are in the same place. What if the dependency exists, but needs fetching?
Not sure I follow. I think what this will look like is a Jenkins file that looks like:
Where Jenkins will still give us the pretty UI, we'll still get the logs. Finally, the other angle that this gives us is something really slick for the Developer use case. One common pain-point (for those of us unlucky enough to be on Comcast/Spectrum/$EvilCable that only gets 25Mb up) is how miserable it is to upload to a cloud to test your changes, and that assumes you have all the secrets. By having COSA as a buildconfig with this interface, I want to bring a Imagine: Where
|
Okay, I think this is ready for review. At the high level, I'm electing to keep calling the JobSpec the "JobSpec"; we have prior art in the existing JobSpec, and in the existing pipeline. And at the end-of-the-day, I disagree that someone will confuse an RPM spec with Jobspec. Should we incept a generic build "properties" file, we can add a "property." Finally, as stated, this is a starting point for distributed builds. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code looks good to me but I'd like to see some some info put into a markdown doc so folks who see this for the first time get a sense of how to use it and/or what it is for.
entrypoint/spec/rhcos_jobspec.go
Outdated
@@ -0,0 +1,161 @@ | |||
package spec | |||
|
|||
type Aliyun struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: In the future I'd love to see these structs with go docs added
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did a little bit more than doc string 'em. I removed the Jenkins Specific pieces.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think in general I don't understand why we aren't just templating a set -euo pipefail
shell script basically. The only slightly tricky thing there is to handle escaping arguments but...ah yeah we have shellquote.Join
vendored in mantle already.
BTW I created https://crates.io/crates/sh-inline to do something similar to this for Rust to use in ostree's test suite - it's not quite the same because that's more about inline whereas this seems more about having the templates maintained outside of the code, but some similar ideas.
// cosaDir is set during a complile setting. | ||
if cosaDir == "" { | ||
path, err := os.Getwd() | ||
if err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm...os.Getwd()
can only fail I think in situations where the current directory is deleted underneath the process...why are we trying to handle that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
being pendantic
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
being pendantic
@darkmuggle To clarify do you mean that you're being pedantic with extra checking?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct, I'm being pedantic with the extra check. It doesn't hurt anything. I definitely wasn't saying that Mr. Walters was being pedantic in the review (which is one way to read my poorly formed sentence fragment.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! To clarify I did read it the first way initially.
But my higher level uncertanty isn't checking for errors - that's always good IMO (now that I look I phrased this badly) so I more meant why are we silently falling back instead of propagating the error? A whole lot of tools out there will fail if their current working directory is deleted - usually this is a user error.
IOW why not just:
if cosaDir == {
path, err := os.Getwd()
if err != nil {
return err
}
Not a big deal obviously, just one of those detail things that can be good to have a rough consensus on since the pattern tends to reoccur elsewhere.
"time" | ||
) | ||
|
||
func RemoveZombies(ctx context.Context, wg *sync.WaitGroup) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd personally be more in favor of adding an assertion that we're not pid 1 (i.e. require that the caller has used --init
or dumb-init
in Kubernetes) and documenting how to do that rather than also carrying our own init. But not a blocker.
} | ||
|
||
// Wait for command to exit | ||
err = cmd.Wait() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doing this "start + wait + exit" pattern I think should always raise the question of why we're not just directly exec()
-ing the child process.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
run-steps
can run N files; we wait to see if that the command is done.
Forgot to add though I definitely agree with the argument that we should move some "high level steps" into cosa itself. We have a tension here between logic shared in groovy (e.g. https://github.com/coreos/coreos-ci-lib/ ) and this is effectively arguing for draining some of the groovy - which I think makes sense. It'd probably help a lot to understand this PR if e.g. it also added some steps/templates and then switched the |
This provides template support for COSA commands with the goal of having COSA understand the RHCOS jobspec. The goal of this work is to provide templated commands. For example: cosa entry --spec rhcos.yaml run cosa init "{{ .Recipe.GitUrl}}" Or: cosa entry --spec rhcos.yaml run-steps rhcos.steps Where rhcos.steps has: cosa init "{{ .Recipe.GitUrl }}" cosa fetch cosa build The alternative entry is used when the envVar `COSA_CI_ENTRY=1`.
Signed-off-by: Ben Howard <[email protected]>
To be blunt: this is an understatement. My core argument is that the Groovy is actually a liability on the RHCOS side. I think I make the case for what I'm arguing for and why in the added README.md -- and the vision I want to push. |
@@ -0,0 +1,145 @@ | |||
# Entrypoint |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a fantastic write up! One nit which I do not think is worth looking at in this PR itself is I believe it would be best to have the "what and how to use" up front followed by the "why". I believe more people will be interested in what it is and how to use it before they end up asking what brought us from where we are today to this pattern.
Signed-off-by: Ben Howard <[email protected]>
This overall looks sane to me. I still don't quite understand why this isn't just another /approve |
@jlebon I hope that the next PR will help to answer the Regarding making this another Mantle command, I did consider it -- and I even considered a shared vendor directory. There is some disagreement about the vendoring approach. Mantle has a clear purpose in code in doing $THING with an artifact (run, test, publish), while this new command is more meta about running COSA itself. And I did ask two other people about whether I should put this in Mantle; the consensus was no. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
This can't really break anything right now, so I'm fine moving forward and addressing more things as followups. In particular I think the time where we try to replace parts of the pipeline should be revealing.
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: ashcrow, cgwalters, darkmuggle, jlebon The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
This provides template support for COSA commands with the goal of having
COSA understand the RHCOS jobspec. The goal of this work is to provide
templated commands. For example:
cosa entry --spec rhcos.yaml run cosa init "{{ .Recipe.GitUrl}}"
Or:
cosa entry --spec rhcos.yaml runSteps --steps rhcos.steps
Where rhcos.steps has:
The way I envision this being implements is via a container where the JobSpec is provided via ConfigMap: